Author Search Result

[Author] Takao ONOYE(65hit)

21-40hit(65hit)

  • Implementation of Viterbi Decoder toward GPU-Based SDR Receiver

    Kosuke TOMITA  Masahide HATANAKA  Takao ONOYE  

     
    PAPER

      Vol:
    E98-A No:11
      Page(s):
    2246-2253

    Viterbi decoding is commonly used in several protocols, but its computational cost is quite high, so it must be implemented efficiently. This paper describes a GPU implementation of a Viterbi decoder utilizing the three-point Viterbi decoding algorithm (TVDA), in which the received bits are divided into multiple chunks and several chunks are decoded simultaneously. Coalesced access and Warp Shuffle, a newly introduced instruction, are also utilized to improve decoder performance. In addition, iterative execution of parallel chunk decoding reduces the latency of the proposed Viterbi decoder so that it can be used as part of a GPU-based SDR transceiver. As a result, the throughput of the proposed Viterbi decoder is improved by 23.1%.
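
    For orientation, the baseline computation that such a decoder parallelizes can be sketched as a plain sequential hard-decision Viterbi decoder. This is not the TVDA or the GPU implementation from the paper; the rate-1/2, constraint-length-3 code with generators (7, 5) and the names `encode` and `viterbi_decode` are assumptions chosen only for illustration.

```python
# Illustrative only: a sequential hard-decision Viterbi decoder for an
# assumed rate-1/2, K=3 convolutional code with generators (7, 5) octal.
G = (0b111, 0b101)        # assumed generator polynomials
K = 3                     # assumed constraint length
N_STATES = 1 << (K - 1)


def parity(x):
    """Return the XOR of all bits of x."""
    return bin(x).count("1") & 1


def encode(bits):
    """Convolutionally encode a list of 0/1 bits, starting from state 0."""
    state, out = 0, []
    for b in bits:
        reg = (b << (K - 1)) | state      # newest bit sits in the MSB
        out.extend(parity(reg & g) for g in G)
        state = reg >> 1
    return out


def viterbi_decode(received, n_bits):
    """Decode 2*n_bits hard-decision bits using Hamming branch metrics."""
    inf = float("inf")
    metrics = [0] + [inf] * (N_STATES - 1)   # the encoder starts in state 0
    history = []                             # per step: (prev_state, bit)
    for t in range(n_bits):
        r = received[2 * t:2 * t + 2]
        new_metrics = [inf] * N_STATES
        prev = [(0, 0)] * N_STATES
        for s in range(N_STATES):
            if metrics[s] == inf:
                continue
            for b in (0, 1):
                reg = (b << (K - 1)) | s
                expected = [parity(reg & g) for g in G]
                nxt = reg >> 1
                m = metrics[s] + sum(e != x for e, x in zip(expected, r))
                if m < new_metrics[nxt]:     # keep the survivor path
                    new_metrics[nxt] = m
                    prev[nxt] = (s, b)
        metrics = new_metrics
        history.append(prev)
    # Trace back from the best final state.
    state = min(range(N_STATES), key=lambda s: metrics[s])
    decoded = []
    for prev in reversed(history):
        state, b = prev[state]
        decoded.append(b)
    return decoded[::-1]


message = [1, 0, 1, 1, 0, 0, 1, 0, 0, 0]
coded = encode(message)
coded[3] ^= 1                                # inject one channel bit error
print(viterbi_decode(coded, len(message)) == message)
```

    The inner loops over states are independent within one time step, which is the kind of work a GPU implementation can spread across threads; how the paper divides the received bits into chunks is not reproduced here.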

  • VLSI Architecture of Switching Control for AAL Type2 Switch

    Masahide HATANAKA  Toshihiro MASAKI  Takao ONOYE  Koso MURAKAMI  

     
    PAPER

      Vol:
    E83-A No:3
      Page(s):
    435-441

    This paper presents the switching control and VLSI architecture for the AAL2 switch. The ATM network with the AAL2 switch can efficiently transmit low-bit-rate data, even if the network has many endpoints. The switch is capable of not only switching AAL2 cells but also converting the header of other types of ATMs. The AAL2 switch is integrated into a single chip. The proposed ATM network is constructed by AAL2 switches attached to the ATM switches.

  • Thermal-Comfort Aware Online Co-Scheduling Framework for HVAC, Battery Systems, and Appliances in Smart Buildings

    Daichi WATARI  Ittetsu TANIGUCHI  Francky CATTHOOR  Charalampos MARANTOS  Kostas SIOZIOS  Elham SHIRAZI  Dimitrios SOUDRIS  Takao ONOYE  

     
    INVITED PAPER

      Publicized:
    2022/10/24
      Vol:
    E106-A No:5
      Page(s):
    698-706

    Energy management in buildings is vital for reducing electricity costs and maximizing the comfort of occupants. Excess solar generation can be used by combining a battery storage system and a heating, ventilation, and air-conditioning (HVAC) system so that occupants feel comfortable. Despite several studies on the scheduling of appliances, batteries, and HVAC, comprehensive and time scalable approaches are required that integrate such predictive information as renewable generation and thermal comfort. In this paper, we propose a thermal-comfort aware online co-scheduling framework that incorporates optimal energy scheduling and a prediction model of PV generation and thermal comfort with the model predictive control (MPC) approach. We introduce a photovoltaic (PV) energy nowcasting and thermal-comfort-estimation model that provides useful information for optimization. The energy management problem is formulated as three coordinated optimization problems that cover fast and slow time-scales by considering predicted information. This approach reduces the time complexity without a significant negative impact on the result's global nature and its quality. Experimental results show that our proposed framework achieves optimal energy management that takes into account the trade-off between electricity expenses and thermal comfort. Our sensitivity analysis indicates that introducing a battery significantly improves the trade-off relationship.

  • An In-Vehicle Auditory Signal Evaluation Platform based on a Driving Simulator

    Fuma SAWA  Yoshinori KAMIZONO  Wataru KOBAYASHI  Ittetsu TANIGUCHI  Hiroki NISHIKAWA  Takao ONOYE  

     
    PAPER-Acoustics

      Publicized:
    2023/05/22
      Vol:
    E106-A No:11
      Page(s):
    1368-1375

    Advanced driver-assistance systems (ADAS) generally play an important role in supporting safe driving by detecting potential risk factors beforehand and informing the driver of them. However, if too many ADAS services rely on visual-based technologies, the driver becomes increasingly burdened and exhausted, especially in the eyes. Drivers should be relieved of monitoring tasks other than the significantly important ones in order to alleviate their burden as much as possible. In recent years, in-vehicle auditory signals that assist safe driving have been attracting attention as an alternative to visual suggestions. In this paper, we developed an in-vehicle auditory signal evaluation platform in an existing driving simulator. In addition, using in-vehicle auditory signals, we have demonstrated with the developed platform the possibility of partially switching from purely visual-based tasks to a mix with auditory-based ones in order to alleviate the burden on drivers.

  • High-Level Synthesis of a Multithreaded Processor for Image Generation

    Takao ONOYE  Toshihiro MASAKI  Isao SHIRAKAWA  Hiroaki HIRATA  Kozo KIMURA  Shigeo ASAHARA  Takayuki SAGISHIMA  

     
    PAPER-VLSI Design Technology and CAD

      Vol:
    E78-A No:3
      Page(s):
    322-330

    The design procedure of a multithreaded processor dedicated to the image generation is described, which can be achieved by means of a high-level synthesis tool PARTHENON. The processor employs a multithreaded architecture which is a novel promising approach to the parallel image generation. This paper puts special stress on the high-level synthesis scheme which can simplify the behavioral description for the structure and control of a complex hardware, and therefore enables the design of a complicated mechanism for a multithreaded processor. Implementation results of the synthesis are also shown to demonstrate the performance of the designed processor. This processor greatly improves the throughput of the image generation so far attained by the conventional approach.

  • Field Slack Assessment for Predictive Fault Avoidance on Coarse-Grained Reconfigurable Devices

    Toshihiro KAMEDA  Hiroaki KONOURA  Dawood ALNAJJAR  Yukio MITSUYAMA  Masanori HASHIMOTO  Takao ONOYE  

     
    PAPER-Test and Verification

      Vol:
    E96-D No:8
      Page(s):
    1624-1631

    This paper proposes a procedure for avoiding delay faults in the field with slack assessment during standby time. The proposed procedure performs path delay testing and checks whether the slack is larger than a threshold value using a selectable delay embedded in basic elements (BEs). If the slack is smaller than the threshold, a pair of BEs to be replaced, which maximizes the path slack, is identified. Experimental results with two application circuits mapped on a coarse-grained architecture show that, for aging-induced delay degradation, a small threshold slack, which is less than 1 ps in a test case, is enough to ensure delay fault prediction.

  • An Embedded Zerotree Wavelet Video Coding Algorithm with Reduced Memory Bandwidth

    Roberto Y. OMAKI  Gen FUJITA  Takao ONOYE  Isao SHIRAKAWA  

     
    PAPER-Image

      Vol:
    E85-A No:3
      Page(s):
    703-713

    A wavelet based algorithm for scalable video compression is described, with the main focus put on memory bandwidth reduction and efficient VLSI implementation. The proposed algorithm adopts a modified 2-D subband decomposition scheme in conjunction with a partial zerotree search for efficient Embedded Zerotree Wavelet coding. The experiment with the performance of the proposed algorithm in comparison with that of conventional DWT, MPEG-2, and JPEG demonstrates that the image quality of the proposed algorithm is consistently superior to that of JPEG, and our scheme can even outperform MPEG-2 in some cases, although it does not exploit the inter-frame redundancy. In spite of the performance inferiority to the conventional DWT, the proposed algorithm attains significant reduction of DWT memory requirements, enhancing a reasonable balance between implementation cost and image quality.

  • Efficient 3-D Sound Movement with Time-Varying IIR Filters

    Kosuke TSUJINO  Wataru KOBAYASHI  Takao ONOYE  Yukihiro NAKAMURA  

     
    PAPER-Speech/Audio Processing

      Vol:
    E90-A No:3
      Page(s):
    618-625

    3-D sound using head-related transfer functions (HRTFs) is applicable to embedded systems such as portable devices, since it can create spatial sound effect without multichannel transducers. Low-order modeling of HRTF with an IIR filter is effective for the reduction of the computational load required in embedded applications. Although modeling of HRTFs with IIR filters has been studied earnestly, little attention has been paid to sound movement with IIR filters, which is important for practical applications of 3-D sound. In this paper, a practical method for sound movement is proposed, which utilizes time-varying IIR filters and variable delay filters. The computational cost for sound movement is reduced by about 50% with the proposed method, compared to conventional low-order FIR implementation. In order to facilitate efficient implementation of 3-D sound movement, tradeoffs between the subjective quality of the output sound and implementation parameters such as the size of filter coefficient database and the update period of filter coefficients are also discussed.
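
    The idea of a filter whose coefficients change as the sound source moves can be sketched as follows. This is a minimal illustration and not the method of the paper: the first-order filter, the function name `time_varying_iir`, the coefficient table, and the update period are all hypothetical, and the variable delay filters mentioned in the abstract are not modeled.

```python
# Illustrative only: a first-order IIR filter y[n] = b0*x[n] + a1*y[n-1]
# whose coefficients are switched every `update_period` samples.
# The coefficient table stands in for a (hypothetical) database of
# low-order HRTF models indexed by source position.

def time_varying_iir(x, coeff_table, update_period):
    """Filter x, taking (b0, a1) from coeff_table once per update period."""
    y = []
    prev = 0.0                               # filter state carried across updates
    for n, sample in enumerate(x):
        # Hold the last table entry once the table is exhausted.
        idx = min(n // update_period, len(coeff_table) - 1)
        b0, a1 = coeff_table[idx]
        prev = b0 * sample + a1 * prev
        y.append(prev)
    return y


impulse = [1.0, 0.0, 0.0, 0.0]
table = [(1.0, 0.5), (1.0, 0.0)]             # assumed example coefficients
print(time_varying_iir(impulse, table, update_period=2))
```

    In this sketch a shorter update period means more frequent table lookups and a longer table for the same movement, while a longer period makes each coefficient change more abrupt; this mirrors the kind of tradeoff between update period and database size that the abstract mentions.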

  • Hardware Architecture of the Fast Mode Decision Algorithm for H.265/HEVC

    Wenjun ZHAO  Takao ONOYE  Tian SONG  

     
    PAPER-VLSI Design Technology and CAD

      Vol:
    E98-A No:8
      Page(s):
    1787-1795

    In this paper, a specified hardware architecture of the Fast Mode Decision (FMD) algorithms presented in our previous work is proposed. This architecture is designed as an embedded mode dispatch module. On the basis of this module, some unnecessary modes can be skipped or the mode decision process can be terminated in advance. In order to maintain a higher compatibility, the FMD algorithms are designed together as a unique module that can be easily embedded into a common video codec for H.265/HEVC. The input and output interfaces between the proposed module and other parts of the codec are designed based on a simple but effective protocol. Hardware synthesis results on FPGA demonstrate that the proposed architecture achieves a maximum frequency of about 193 MHz with less than 1% of the total resources consumed. Moreover, the proposed module can improve the overall throughput.

  • Jitter Amplifier for Oscillator-Based True Random Number Generator

    Takehiko AMAKI  Masanori HASHIMOTO  Takao ONOYE  

     
    PAPER-Cryptography and Information Security

      Vol:
    E96-A No:3
      Page(s):
    684-696

    We propose a jitter amplifier architecture for an oscillator-based true random number generator (TRNG). Two types of latency-controllable (LC) buffer, which are the key components of the proposed jitter amplifier, are presented. We derive an equation to estimate the gain of the jitter amplifier, and analyze sufficient conditions for the proposed circuit to work properly. The proposed jitter amplifier was fabricated with a 65 nm CMOS process. The jitter amplifier with the two-voltage LC buffer occupied 3,300 µm² and attained 8.4x gain, and that with the single-voltage LC buffer achieved 2.2x gain with a 1,700 µm² area. The jitter amplification of the sampling clock increased the entropy of a bit stream and improved the results of the NIST test suite so that all the tests passed whereas TRNGs with simple correctors failed. The jitter amplifier attained higher throughput per area than a frequency divider when the required amount of jitter was more than two times larger than the inherent jitter in our test-chip implementations.

  • Reliability-Configurable Mixed-Grained Reconfigurable Array Supporting C-Based Design and Its Irradiation Testing

    Hiroaki KONOURA  Dawood ALNAJJAR  Yukio MITSUYAMA  Hajime SHIMADA  Kazutoshi KOBAYASHI  Hiroyuki KANBARA  Hiroyuki OCHI  Takashi IMAGAWA  Kazutoshi WAKABAYASHI  Masanori HASHIMOTO  Takao ONOYE  Hidetoshi ONODERA  

     
    PAPER-High-Level Synthesis and System-Level Design

      Vol:
    E97-A No:12
      Page(s):
    2518-2529

    This paper proposes a mixed-grained reconfigurable architecture consisting of fine-grained and coarse-grained fabrics, each of which can be configured for different levels of reliability depending on the reliability requirement of target applications, e.g. mission-critical applications to consumer products. Thanks to the fine-grained fabrics, the architecture can accommodate a state machine, which is indispensable for exploiting C-based behavioral synthesis to trade latency with resource usage through multi-step processing using dynamic reconfiguration. In implementing the architecture, the strategy of dynamic reconfiguration, the assignment of configuration storage and the number of implementable states are key factors that determine the achievable trade-off between used silicon area and latency. We thus split the configuration bits into two classes, state-wise configuration bits and state-invariant configuration bits, to minimize the area overhead of configuration bit storage. Through a case study, we experimentally explore the appropriate number of implementable states. A proof-of-concept VLSI chip was fabricated in a 65 nm process. Measurement results show that applications on the chip can operate in a harsh radiation environment. Irradiation tests also show the correlation between the number of sensitive bits and the mean time to failure. Furthermore, the temporal error rate of an example application due to soft errors in the datapath was measured and demonstrated for reliability-aware mapping.

  • Low-Power Scheme of NMOS 4-Phase Dynamic Logic

    Bao-Yu SONG  Makoto FURUIE  Yukihiro YOSHIDA  Takao ONOYE  Isao SHIRAKAWA  

     
    LETTER-Low-Power Circuit Technique

      Vol:
    E82-C No:9
      Page(s):
    1772-1776

    An NMOS 4-phase dynamic logic scheme is described, which is intended to achieve low-power consumption in the deep submicron design. In this scheme, the short-circuit current is eliminated, and moreover, the voltage swing of transition signals is reduced, resulting in enhancing power reduction effectively. First, distinctive features of this 4-phase dynamic logic are specified, as compared with the static CMOS logic and dynamic domino CMOS logic. Then, power simulations are attempted for the 4-phase dynamic logic, static CMOS logic, dynamic CMOS logic, and pass-transistor logic, by using a number of logic modules, which demonstrate that the NMOS 4-phase dynamic logic is the most power-efficient. Moreover, through the gate delay simulation, the capability of how many transistors can be packed in a logic block is also discussed.

  • Signal-Dependent Analog-to-Digital Conversion Based on MINIMAX Sampling

    Igors HOMJAKOVS  Masanori HASHIMOTO  Tetsuya HIROSE  Takao ONOYE  

     
    PAPER

      Vol:
    E96-A No:2
      Page(s):
    459-468

    This paper presents an architecture of a signal-dependent analog-to-digital converter (ADC) based on the MINIMAX sampling scheme, which achieves a high data compression rate and power reduction. The proposed architecture consists of a conventional synchronous ADC, a timer and a peak detector. AD conversion is carried out only when input signal peaks are detected. To improve the accuracy of signal reconstruction, MINIMAX sampling is improved so that multiple points are captured for each peak, and its effectiveness is experimentally confirmed. In addition, power reduction, which is the primary advantage of the proposed signal-dependent ADC, is analytically discussed and then validated with circuit simulations.
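
    The principle of converting only at signal peaks can be sketched in software as follows. This is a simplified illustration under stated assumptions, not the circuit or the improved multi-point scheme of the paper: the names `minimax_sample` and `reconstruct` are hypothetical, extrema are detected by a sign change of the discrete slope, and linear interpolation is assumed for reconstruction.

```python
# Illustrative only: keep (time index, value) pairs at local maxima and
# minima, mimicking a peak detector that triggers the ADC and a timer
# that records when each conversion happened.

def minimax_sample(signal):
    """Return (index, value) pairs at local extrema plus both endpoints."""
    samples = [(0, signal[0])]
    for i in range(1, len(signal) - 1):
        before = signal[i] - signal[i - 1]
        after = signal[i + 1] - signal[i]
        if before * after < 0:               # slope changes sign: a peak
            samples.append((i, signal[i]))   # flat plateaus are ignored
    samples.append((len(signal) - 1, signal[-1]))
    return samples


def reconstruct(samples, length):
    """Rebuild a signal by linear interpolation between kept samples."""
    out = [0.0] * length
    for (i0, v0), (i1, v1) in zip(samples, samples[1:]):
        for i in range(i0, i1 + 1):
            out[i] = v0 + (v1 - v0) * (i - i0) / (i1 - i0)
    return out


signal = [0, 1, 2, 1, 0, -1, 0, 1]
kept = minimax_sample(signal)
print(kept)
print(reconstruct(kept, len(signal)))
```

    Only the kept pairs need to be stored or transmitted, which is where the data compression comes from; a signal that is not piecewise linear between peaks is reconstructed less accurately by this sketch, which is consistent with the abstract's motivation for capturing multiple points per peak.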

  • A Single Tooth Segmentation Using PCA-Stacked Gabor Filter and Active Contour

    Pramual CHOORAT  Werapon CHIRACHARIT  Kosin CHAMNONGTHAI  Takao ONOYE  

     
    PAPER-Image Processing

      Vol:
    E96-A No:11
      Page(s):
    2169-2178

    In tooth contour extraction there is insufficient intensity difference in x-ray images between the tooth and dental bone. This difference must be enhanced in order to improve the accuracy of tooth segmentation. This paper proposes a method to improve the intensity between the tooth and dental bone. This method consists of an estimation of tooth orientation (intensity projection, smoothing filter, and peak detection) and PCA-Stacked Gabor with ellipse Gabor banks. Tooth orientation estimation is performed to determine the angle of a single oriented tooth. PCA-Stacked Gabor with ellipse Gabor banks is then used, in particular to enhance the border between the tooth and dental bone. Finally, active contour extraction is performed in order to determine tooth contour. In the experiment, in comparison with the conventional active contour without edge (ACWE) method, the average mean square error (MSE) values of extracted tooth contour points are reduced from 26.93% and 16.02% to 19.07% and 13.42% for tooth x-ray type I and type H images, respectively.

  • SOH Aware System-Level Battery Management Methodology for Decentralized Energy Network

    Daichi WATARI  Ittetsu TANIGUCHI  Takao ONOYE  

     
    PAPER-VLSI Design Technology and CAD

      Vol:
    E103-A No:3
      Page(s):
    596-604

    The decentralized energy network is one of the promising solutions as a next-generation power grid. In this system, each house has a photovoltaic (PV) panel as a renewable energy source and a battery, which is an essential component for balancing generation and demand. The common objective of battery management in such systems is to minimize only the energy purchased from a power company, but battery degradation caused by charge/discharge cycles is also a serious problem. This paper proposes a State-of-Health (SOH) aware system-level battery management methodology for the decentralized energy network. The power distribution problem is often solved with mixed integer programming (MIP), and the proposed MIP formulation takes into account the SOH model. In order to minimize the purchased energy and reduce the battery degradation simultaneously, the optimization problem is divided into two stages: 1) purchased energy minimization, and 2) battery aging factor reduction, which makes it possible to explore the trade-off between the purchased energy and the battery degradation. Experimental results show that the proposed method achieves a better trade-off and reduces the battery aging cost by 14% over the baseline method while keeping the purchased energy minimum.

  • Efficient Memory Organization Framework for JPEG2000 Entropy Codec

    Hiroki SUGANO  Takahiko MASUZAKI  Hiroshi TSUTSUI  Takao ONOYE  Hiroyuki OCHI  Yukihiro NAKAMURA  

     
    PAPER-Realization

      Vol:
    E92-A No:8
      Page(s):
    1970-1977

    The encoding/decoding process of JPEG2000 requires much more computation power than that of conventional JPEG mainly due to the complexity of the entropy encoding/decoding. Thus usually multiple entropy codec hardware modules are implemented in parallel to process the entropy encoding/decoding. This module, however, requests many small-size memories to store intermediate data, and when multiple modules are implemented on a chip, employment of the large number of SRAMs increases difficulty of whole chip layout. In this paper, an efficient memory organization framework for the entropy encoding/decoding module is proposed, in which not only existing memory organizations but also our proposed novel memory organization methods are attempted to expand the design space to be explored. As a result, the efficient memory organization for a target process technology can be explored.

  • Power Gating Implementation for Supply Noise Mitigation with Body-Tied Triple-Well Structure

    Yasumichi TAKAI  Masanori HASHIMOTO  Takao ONOYE  

     
    PAPER-Circuit Design

      Vol:
    E95-A No:12
      Page(s):
    2220-2225

    This paper investigates power gating implementations that mitigate power supply noise. We focus on the body connection of power-gated circuits, and examine the amount of power supply noise induced by power-on rush current and the contribution of a power-gated circuit as a decoupling capacitance during the sleep mode. To figure out the best implementation, we designed and fabricated a test chip in a 65 nm process. Experimental results with measurement and simulation reveal that the power-gated circuit with body-tied structure in triple-well is the best implementation from the following three points: power supply noise due to rush current, the contribution of decoupling capacitance during the sleep mode, and the leakage reduction thanks to power gating.

  • Real-Time Human Object Extraction Method for Mobile Systems Based on Color Space Segmentation

    Gen FUJITA  Takaaki IMANAKA  Hyunh Van NHAT  Takao ONOYE  Isao SHIRAKAWA  

     
    PAPER

      Vol:
    E89-A No:4
      Page(s):
    941-949

    Since a human object is an important element of the moving pictures being processed by mobile terminals, establishing a human object extraction method encourages dissemination of new applications. In accordance with the requirement of mobile applications, this paper proposes a low-cost human object extraction method, which consists of a face object and a hair object extraction based on their color information and a simple body extraction utilizing the position information of the face object. In the proposed method, skin color and hair color are estimated through color space segmentation, and a human object is effectively extracted by using a radial active contour model. Simulation results of the human object extraction on an XScale processor show that QCIF 15 fps video sequences can be processed in real time.

  • FOREWORD

    Akira TAGUCHI  Takao ONOYE  

     
    FOREWORD

      Vol:
    E91-A No:10
      Page(s):
    2896-2896

  • SET Pulse-Width Measurement Suppressing Pulse-Width Modulation and Within-Die Process Variation Effects

    Ryo HARADA  Yukio MITSUYAMA  Masanori HASHIMOTO  Takao ONOYE  

     
    PAPER

      Vol:
    E97-A No:7
      Page(s):
    1461-1467

    This paper presents a measurement circuit structure for capturing SET pulse-width while suppressing pulse-width modulation and within-die process variation effects. For mitigating pulse-width modulation while maintaining area efficiency, the proposed circuit uses massively parallelized short inverter chains as a target circuit. Moreover, for each inverter chain on each die, pulse-width calibration is performed. In measurements, narrow SET pulses ranging from 5 ps to 215 ps were obtained. We confirm that an overestimation of pulse-width may happen when ignoring die-to-die and within-die variation of the measurement circuit. Our evaluation results thus point out that calibration for within-die variation in addition to die-to-die variation of the measurement circuit is indispensable.
